An Exact Algorithm to Identify Motifs in Orthologous Sequences from Multiple Species

Authors

  • Mathieu Blanchette
  • Benno Schwikowski
  • Martin Tompa
Abstract

The identification of sequence motifs is a fundamental method for suggesting good candidates for biologically functional regions such as promoters, splice sites, binding sites, etc. We investigate the following approach to identifying motifs: given a collection of orthologous sequences from multiple species related by a known phylogenetic tree, search for motifs that are well conserved (according to a parsimony measure) in the species. We present an exact algorithm for solving this problem. We then discuss experimental results on finding promoters of the rbcS gene for a family of 10 plants, on finding promoters of the adh gene for 12 Drosophila species, and on finding promoters of several chloroplast encoded genes.
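The parsimony criterion mentioned in the abstract can be illustrated with a small dynamic program over the tree. The sketch below is not taken from the paper: the function name `substring_parsimony`, the nested-tuple tree encoding, and the brute-force minimisation over all k-mers are assumptions made for illustration, and it is only practical for very small k. For each node and each candidate k-mer it computes the cheapest total number of substitutions in the subtree when that node is labelled with the k-mer, where leaves may only use k-mers that occur in their own sequence.

```python
from itertools import product

INF = float("inf")


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def substring_parsimony(tree, leaf_seqs, k):
    """Return (best_score, root_kmer) for a motif of length k.

    tree      -- nested tuples of leaf names (hypothetical encoding),
                 e.g. (("a", "b"), "c")
    leaf_seqs -- dict mapping each leaf name to its DNA sequence
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]

    def table(node):
        # Leaf: cost 0 for k-mers present in the sequence, infinite otherwise.
        if isinstance(node, str):
            seq = leaf_seqs[node]
            present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
            return {s: (0 if s in present else INF) for s in kmers}
        # Internal node: for each label s, add up the cheapest choice
        # of child label t (child subtree cost plus edge substitutions).
        child_tables = [table(child) for child in node]
        return {
            s: sum(
                min(cost + hamming(s, t) for t, cost in ct.items())
                for ct in child_tables
            )
            for s in kmers
        }

    root = table(tree)
    best = min(root, key=root.get)
    return root[best], best


# Toy input (made up for illustration): "ACG" occurs in all three leaves.
example_tree = (("a", "b"), "c")
example_seqs = {"a": "TTACGTT", "b": "GGACGGG", "c": "CACGCCC"}
print(substring_parsimony(example_tree, example_seqs, 3))  # prints (0, 'ACG')
```

A score of 0 means some k-mer is perfectly conserved in every species; larger scores count the minimum number of substitutions needed along the tree. This naive version compares every pair of k-mers on every edge, so its cost grows as 16^k per edge.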

Similar resources

Discovery of regulatory elements by a computational method for phylogenetic footprinting.

Phylogenetic footprinting is a method for the discovery of regulatory elements in a set of orthologous regulatory regions from multiple species. It does so by identifying the best conserved motifs in those orthologous regions. We describe a computer algorithm designed specifically for this purpose, making use of the phylogenetic relationships among the sequences under study to make more accurat...

identify regulatory motifs. Bioinformatics. Vol 19:18 (2369-2380)

Identification of regulatory motifs in DNA sequences is made difficult primarily by their degeneracy. Computational techniques to find statistically over-represented sequence profiles are aided by inputting sequences in which motifs are fairly certain to be found. Prudent selection of orthologous genes ensures that species are sufficiently diverged to have low sequence similarities in regions n...

PairMotif: A New Pattern-Driven Algorithm for Planted (l, d) DNA Motif Search

Motif search is a fundamental problem in bioinformatics with an important application in locating transcription factor binding sites (TFBSs) in DNA sequences. The exact algorithms can report all (l, d) motifs and find the best one under a specific objective function. However, it is still a challenging task to identify weak motifs, since either a large amount of memory or execution time is requi...

An integrative and applicable phylogenetic footprinting framework for cis-regulatory motifs identification in prokaryotic genomes

Background: Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties for optimizing the selection of orthologous data and reducing the false positives in motif p...

Identify SSR Regulators for Functional Gene Sets through Cross-Species Comparison

Single sequence repeats (SSRs) are DNA sequences composed of tandem repetitions of relatively short motifs. They are not only considered as genetic markers but also play an important role in gene regulatory networks, one of the greatest challenges of functional genomics. In order to identify key SSR regulators among functional gene sets, we have developed an efficient algorithm for SSR pattern ...

Journal:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume 8

Publication date: 2000